Gridded daily rainfall data for Ghana for the period 1960 - 2015: Approach and validation process

Rainfall data are needed at resolutions that allow modelling of environmental change and its impact on socio-economic well-being. This is particularly true for agricultural output in much of Africa, Ghana being no exception, where data are required for intra- and inter-seasonal assessment of the impact of rainfall on yield. However, the sparse distribution of gauge measurements, coupled with periods with no record of daily rainfall, limits their application. This dataset addresses that deficiency by providing high-resolution (0.25° × 0.25°) gridded daily rainfall, generated with the Minimum Surface Curvature (MSC) interpolation from 190 stations distributed across all agro-climatic zones of Ghana. Validation is done using 19 Ghana Meteorological Agency (GMet) gauge stations (10%) by comparing the ratios, correlations and Root Mean Square Error (RMSE) of the observed to the gridded data for the seasons.

Specifications Table

Subject: Environmental Science; Climatology
Specific subject area: Meteorology (daily rainfall)
Type of data: Daily rainfall dataset for the period 1960 to 2015 covering the whole of Ghana (NetCDF file)
How the data were acquired: A total of 190 stations across the agro-climatic zones of Ghana (Table 2) were used as the input data. The distribution of the stations over the country is depicted in Fig. 2, and the number of stations in each zone is given in Table 2. The records for these stations were not complete at all times, as some stations had gaps for some days and months. This challenge was addressed by the interpolation. A 56-year time frame (1960-2015) of daily rainfall data for each station results in 20,496 time-steps per station across the entire country. Data homogenisation was done using RHtestsV4 to identify stations with excessive change points, which were subsequently excluded [1, 2].
As detailed in [1], we performed the homogenisation test to identify change points (Fig. 1). Thereafter, we adjusted the past records, using the magnitudes of the change points as adjustment factors. As a further quality control, spurious records were manually detected and removed from the time series, since their inclusion in the spatial gridding could produce wrong values and 'unrealistic' inflections.
The Minimum Surface Curvature (MSC) with tensioning interpolation technique was used to grid the data at a spatial resolution of 0.25° × 0.25°. Thereafter, the data were processed into a NetCDF file, followed by a point-pixel validation of the gridded dataset against data from unused stations.
The MSC technique [1, 4] is widely applied where smooth approximation and interpolation are carried out for temperature, rainfall, water heads, positional data and potential fields. It is noted for its speed of computation even for large numbers of points, but it is constrained by its complicated algorithm and limited ability to conserve extrapolation trends. Application of the MSC interpolation at a 0.25° × 0.25° gridding resulted in a final NetCDF file of the daily rainfall dataset, and the final output was validated using a point-pixel assessment [3].
Gauge station data were obtained from the Ghana Meteorological Agency (GMet) across all four agro-ecological zones. These constituted the input data from which a daily dataset for the 56-year period was generated, compensating for the records missing for some periods and some gauge stations within the period of study.
Data source location: Fig. 2 shows the location of all gauge stations whose daily rainfall data were used for the interpolation; see also Table 2.

Value of the Data
• Daily rainfall data at a resolution of 0.25° × 0.25° allow the modelling of environmental change and its impact on socio-economic well-being, particularly agricultural output.
• This dataset addresses the sparse and limited distribution of point data from gauge measurements, coupled with periods of no record of daily rainfall in Ghana, which limits their application.
• The data are useful for scientific inquiry requiring data generated specifically for Ghana, as opposed to globally generated datasets, which may not reflect local conditions and are limited by their input data sources [7].
• The data are relevant for crop models that require daily rainfall data for a tropical zone, as they can be used to compute varied agro-climatological indices.
• Further use of this dataset may involve assessing the accuracy of the output file with new validation or accuracy assessment methods, or running the same input data through a different interpolation method and comparing the results.
• Meteorologists, agricultural extension officers, ecologists, climate scientists, forestry officials and hydrologists may find this daily rainfall dataset more complete and reliable than gauge data, which suffer from inconsistencies in much of Africa.
• The dataset is relevant to geographers, agriculturalists, foresters, meteorologists, climate scientists and others because long-term daily rainfall data are unavailable for most parts of the country. The distribution of gauges is very sparse, and there are often long periods when the equipment is defective and therefore unreliable as a data source. This dataset provides daily precipitation for every location (point or area) in Ghana and can be accessed much more easily, reducing the burden on researchers of obtaining data for meteorological or climate studies.
• The dataset therefore addresses the data availability problem by ensuring easy and wide dissemination of a much-needed dataset for researchers.
• The dataset has a record of about 56 years, which makes it more suitable for long-term applications than other publicly available datasets with a very short span.
• The resolution of this dataset is much finer than that of global precipitation datasets, which have very coarse resolution and have been produced without reference to local conditions.

Objective
The prime objective of this project was to develop a concise daily rainfall dataset that spans more than half a century and is reliable for climate studies, agro-meteorology, catchment hydrology and climatology. It provides comprehensive gridded data for the whole of Ghana at a high resolution of 0.25°. This means that daily rainfall data can be obtained for every locality and point with known coordinates. This is necessary because the few working meteorological gauge stations are very sparsely distributed and, in some localities, too unreliable for research work [2].

Data Description
Two sets of data are provided in this work: the input data (Excel Binary) and an output NetCDF file, which can be accessed with ArcGIS software and with other tools available on Linux, in Python, etc.
The input data consist of daily rainfall records from 190 stations distributed across all four agro-ecological zones of Ghana (Fig. 2). These primary data were obtained from the Ghana Meteorological Agency (GMet). The list of stations is provided in Table 1, and their distribution across the agro-ecological zones in Table 2. The 190 stations include agro-meteorological, climatological, synoptic and experimental gauge stations. They were selected on the basis of a threshold of no more than 10% missing data for the period 1960-2015. The records were then sorted and rearranged into Year-Month-Day (YY-MM-DD) order, and the station coordinates were appended.
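The selection rule described above can be illustrated with a minimal sketch. It assumes each station record is a list of daily values in which missing days are marked as None or NaN. The function names, the toy records and the treatment of an empty record as fully missing are hypothetical choices for illustration and are not taken from the authors' scripts.

```python
import math


def missing_fraction(values):
    """Return the share of daily records that are missing (None or NaN)."""
    if not values:
        return 1.0  # assumption: an empty record counts as fully missing
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return missing / len(values)


def select_stations(records, max_missing=0.10):
    """Keep stations with no more than max_missing share of missing days."""
    return {
        name: values
        for name, values in records.items()
        if missing_fraction(values) <= max_missing
    }


# Hypothetical toy records in mm/day; None marks a day without observation.
records = {
    "station_a": [0.0, 5.2, None, 1.1, 0.0, 0.0, 12.4, 0.0, 3.3, 0.0],
    "station_b": [None, None, 4.0, 0.0, None, 2.5, 0.0, 0.0, 0.0, 1.0],
}
selected = select_stations(records)
print(sorted(selected))  # station_a has 10% missing, station_b has 30%
```

With real records the same check would be applied to the full 1960-2015 series of each station before gridding.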
The selected stations were used for the Minimum Surface Curvature (MSC) interpolation [5] at a resolution of 0.25° × 0.25°. A point-pixel validation was undertaken to assess the accuracy of the final output file, which contains daily rainfall for the whole of Ghana for the 56-year study period. Examples of use of the final output data include computing and displaying the number of wet and dry days for every point and locality in Ghana (Fig. 3) and the number of wet spells (Fig. 4).

Table 2. Distribution of the input stations across the agro-ecological zones of Ghana
No.  Zone               Area (km²)  Area (%)  Stations  Stations (%)
1                       …,584       37        70        37
2    Forest transition  57,566      24        27        14
3    Forest             84,897      36        80        42
4    Coastal            8,488       4         13        7
     Total              238,535     100       190       100
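The wet-day, dry-day and wet-spell counts mentioned above can be derived from a single daily series as in the following sketch. The 1.0 mm wet-day threshold and the minimum spell length of two consecutive wet days are assumptions made for illustration; the source does not state which definitions were used for Figs. 3 and 4.

```python
def count_wet_dry_days(rain, wet_threshold=1.0):
    """Return (wet_days, dry_days) for a daily rainfall series in mm."""
    wet = sum(1 for r in rain if r >= wet_threshold)
    return wet, len(rain) - wet


def count_wet_spells(rain, wet_threshold=1.0, min_length=2):
    """Count runs of at least min_length consecutive wet days."""
    spells = 0
    run = 0  # length of the current run of wet days
    for r in rain:
        if r >= wet_threshold:
            run += 1
        else:
            if run >= min_length:
                spells += 1
            run = 0
    if run >= min_length:  # close a run that reaches the end of the series
        spells += 1
    return spells


# Hypothetical ten-day series for one grid cell (mm/day).
daily = [0.0, 3.2, 7.5, 0.0, 0.4, 12.0, 2.1, 1.0, 0.0, 5.0]
print(count_wet_dry_days(daily))  # (6, 4)
print(count_wet_spells(daily))    # 2
```

Applying these functions to the series of every grid cell would give maps comparable in kind to those described for Figs. 3 and 4.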

Experimental Design, Materials and Methods
This dataset is the output of an interpolation of daily rainfall gauge data from 190 stations across all four agro-ecological zones of Ghana. It relies on the available data from all the stations, although a consistent flaw in the input data is that, for some stations, no data were recorded in many years. This inadequacy is the reason for the interpolation over the 56-year period (20,454 days) covering the whole of Ghana.
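As a quick arithmetic check of the day count quoted above, the number of calendar days from 1 January 1960 to 31 December 2015 can be computed directly. The sketch uses only the Python standard library and assumes the Gregorian calendar with both end dates included.

```python
from datetime import date

# Inclusive number of days between the first and last day of the record.
n_days = (date(2015, 12, 31) - date(1960, 1, 1)).days + 1

# Cross-check: 56 years of 365 days plus one extra day per leap year.
leap_years = sum(
    1 for y in range(1960, 2016)
    if (y % 4 == 0 and y % 100 != 0) or y % 400 == 0
)
assert n_days == 56 * 365 + leap_years

print(n_days, leap_years)  # 20454 14
```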
The new dataset was generated using Python scripts for the interpolation (gridding). The scripts are listed below and can be obtained from https://github.com/Ampofo16693/Daily-Rainfall-Gridding
• Merging the individual Excel files for all 190 stations and interpolation
• Assigning the grid
• Replacing negative values
Data gridding was performed using Minimum Surface Curvature with tensioning as proposed in [6]. The point data were first pre-processed with 'blockmean' for every 0.25° × 0.25° grid cell to avoid spatial aliasing and remove redundant data [1]. This step averages the data within each defined cell while maintaining the spatial signature within it. A tension factor of 1 was used in gridding, which suppresses the undesired oscillations and false maxima/minima that MSC can generate. The MSC with the set tension was iterated on the daily data until convergence was reached, defined as 1e-4 of the root mean square deviation of the data from a best-fit plane.
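Two of the steps above, block averaging before gridding and replacing negative values afterwards, can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' scripts: cells are taken as aligned to multiples of 0.25°, each non-empty cell is reduced to the mean position and mean value of its points, and negative gridded values are replaced with zero (the source does not state the replacement value). The minimum-curvature solver itself is not reproduced here.

```python
import math
from collections import defaultdict


def block_mean(points, cell=0.25):
    """Average (lon, lat, value) points that fall in the same grid cell.

    Returns one (mean_lon, mean_lat, mean_value) tuple per non-empty cell.
    """
    # Per cell: [sum of lon, sum of lat, sum of value, point count]
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for lon, lat, val in points:
        key = (math.floor(lon / cell), math.floor(lat / cell))
        acc = sums[key]
        acc[0] += lon
        acc[1] += lat
        acc[2] += val
        acc[3] += 1
    return [
        (s[0] / s[3], s[1] / s[3], s[2] / s[3])
        for _, s in sorted(sums.items())
    ]


def replace_negative(grid, fill=0.0):
    """Replace negative gridded rainfall values with fill (assumed zero)."""
    return [[fill if v < 0 else v for v in row] for row in grid]


# Hypothetical station values for one day: (longitude, latitude, mm).
points = [(-1.60, 6.70, 10.0), (-1.55, 6.65, 20.0), (-0.20, 5.60, 4.0)]
blocks = block_mean(points)
print(len(blocks))  # the first two points share one 0.25° cell

cleaned = replace_negative([[-0.3, 2.0], [0.0, -1.2]])
print(cleaned)
```

Negative values can arise because a smooth interpolated surface may undershoot zero between dry and wet stations, which is presumably why a replacement step is listed among the scripts.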
In place of error fields, we provide the results of an accuracy assessment and validation, which used data from 19 gauge stations to test the accuracy of the output data at those specific locations. A seasonal summary of the ratio and correlation of the observed gauge data to the output is provided in Tables 3 and 4.
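The three validation statistics named in this article (ratio, correlation and RMSE) can be computed for one station and its corresponding grid cell as in the sketch below. The ratio is taken here as the observed total divided by the gridded total and the correlation as the Pearson coefficient; both are assumptions, since the source does not give the exact formulas. The sample values are hypothetical.

```python
import math


def validation_metrics(observed, gridded):
    """Return (ratio, correlation, rmse) for paired rainfall series."""
    n = len(observed)
    if n != len(gridded) or n < 2:
        raise ValueError("series must have equal length of at least 2")

    # Ratio of observed to gridded totals (assumed definition).
    ratio = sum(observed) / sum(gridded)

    # Pearson correlation coefficient.
    mean_o = sum(observed) / n
    mean_g = sum(gridded) / n
    cov = sum((o - mean_o) * (g - mean_g) for o, g in zip(observed, gridded))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_g = sum((g - mean_g) ** 2 for g in gridded)
    correlation = cov / math.sqrt(var_o * var_g)

    # Root mean square error of gridded against observed values.
    rmse = math.sqrt(
        sum((o - g) ** 2 for o, g in zip(observed, gridded)) / n
    )
    return ratio, correlation, rmse


# Hypothetical seasonal values (mm) at one validation station.
observed = [0.0, 4.0, 10.0, 2.0]
gridded = [1.0, 3.0, 9.0, 3.0]
ratio, correlation, rmse = validation_metrics(observed, gridded)
print(ratio, round(correlation, 3), rmse)
```

A ratio near 1, a correlation near 1 and a small RMSE would together indicate close agreement between the gridded product and the gauge at that location.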

Ethics Statements
The work outlined above did not involve human subjects for which relevant informed consent was required.
There were no animal experiments, so compliance with the related guidelines does not apply. The primary data used for this work were not obtained from any social media platform; they are empirical data obtained from a public institution.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
Gridded Daily Rainfall Data for Ghana for the period 1960 -2015: Approach and Validation Process (Original data) (Mendeley Data).